Non-English Response Detection Method for Automated Proficiency Scoring System
Authors
Abstract
This paper presents a method for identifying non-English speech, with the aim of supporting an automated speech proficiency scoring system for non-native speakers. The method uses a popular technique from the language identification domain: a single phone recognizer followed by multiple language-dependent language models. This technique determines the language of a speech sample from the phonotactic differences among languages. Because the method is intended for use with non-native English speakers, it must be able to distinguish non-English responses from non-native speakers’ English responses. This makes the task more challenging, as the frequent pronunciation errors of non-native speakers may weaken the phonetic and phonotactic distinction between English and non-English responses. To address this issue, a speaking rate measure was used to complement the language identification based features in the model. The method achieved an accuracy of 98%, a 45% relative error reduction over a system based on the conventional language identification technique. The model using both feature sets also improved accuracy for speakers at all English proficiency levels.
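The approach described in the abstract, one phone recognizer feeding several language-dependent language models, with a speaking rate measure added as a complementary feature, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the add-one smoothed bigram model, the toy phone inventory and training sequences, the names `PhoneBigramLM` and `lid_features`, and the definition of speaking rate as phones per second. The phone recognizer itself is not modeled; its output is represented as a plain list of phone symbols.

```python
import math
from collections import Counter


class PhoneBigramLM:
    """Add-one smoothed bigram model over phone symbols (illustrative only)."""

    def __init__(self, sequences, vocab):
        # Predictable symbols: every phone plus the end-of-sequence marker.
        self.vocab_size = len(vocab) + 1
        self.bigrams = Counter()
        self.contexts = Counter()
        for seq in sequences:
            padded = ["<s>"] + list(seq) + ["</s>"]
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def avg_log_prob(self, seq):
        """Length-normalized log-probability of a phone sequence."""
        padded = ["<s>"] + list(seq) + ["</s>"]
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.contexts[prev] + self.vocab_size
            total += math.log(num / den)
        return total / (len(padded) - 1)


def lid_features(phones, duration_sec, models):
    """Return a phonotactic margin and a speaking rate for one response.

    `models` maps a language name to its PhoneBigramLM and must contain
    the key "english" plus at least one other language.
    """
    scores = {lang: lm.avg_log_prob(phones) for lang, lm in models.items()}
    best_other = max(s for lang, s in scores.items() if lang != "english")
    return {
        # Positive when the English model explains the phones best.
        "english_margin": scores["english"] - best_other,
        # Assumed definition: recognized phones per second.
        "speaking_rate": len(phones) / duration_sec,
    }


# Hypothetical toy data standing in for recognizer output.
VOCAB = {"a", "b", "i", "k", "s", "t"}
MODELS = {
    "english": PhoneBigramLM([["s", "t", "a"], ["s", "t", "i"], ["k", "a", "t"]], VOCAB),
    "other": PhoneBigramLM([["a", "k", "a"], ["i", "k", "i"], ["a", "b", "a"]], VOCAB),
}

if __name__ == "__main__":
    print(lid_features(["s", "t", "a"], 1.5, MODELS))
    print(lid_features(["a", "k", "a"], 1.5, MODELS))
```

In a sketch like this, the two features would be passed to a downstream classifier that decides whether the response is English. The abstract does not say which classifier was used, so none is shown.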
Similar resources
Similarity-Based Non-Scorable Response Detection for Automated Speech Scoring
This study provides a method that identifies problematic responses which make automated speech scoring difficult. When automated scoring is used in the context of a high stakes language proficiency assessment, for which the scores are used to make consequential decisions, some test takers may have an incentive to try to game the system in order to artificially inflate their scores. Since many a...
Automatic scoring of non-native children's spoken language proficiency
In this study, we aim to automatically score the spoken responses from an international English assessment targeted to non-native English-speaking children aged 8 years and above. In contrast to most previous studies focusing on scoring of adult non-native English speech, we explored automated scoring of child language assessment. We developed automated scoring models based on a large set of fe...
Modeling Discourse Coherence for the Automated Scoring of Spontaneous Spoken Responses
This study describes an approach for modeling the discourse coherence of spontaneous spoken responses in the context of automated assessment of non-native speech. Although the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spontaneous spoken language, little prior research has been done to assess a speaker’s coherence in the context of a...
Acoustic Feature-based Non-scorable Response Detection for an Automated Speaking Proficiency Assessment
This study provides a method that increases the robustness of automated speech scoring. Responses with sub-optimal characteristics such as background noise, volume problems, non-English speech, whispered speech, and non-responses make automated scoring more difficult. For instance, loud background noises distort the spectral characteristics of speech, and the performance of the prosody and pron...
Off-Topic Spoken Response Detection with Word Embeddings
In this study, we developed an automated off-topic response detection system as a supplementary module for an automated proficiency scoring system for non-native English speakers’ spontaneous speech. Given a spoken response, the system first generates an automated transcription using an ASR system trained on non-native speech, and then generates a set of features to assess similarity to the que...